Acoustic change detection and segment clustering of two-way telephone conversations

Authors

  • Xin Zhong
  • Mark A. Clements
  • Sung Lim
Abstract

We apply the Bayesian information criterion (BIC) to the unsupervised segmentation of two-way telephone conversations by speaker turn, and then group the resulting segments into homogeneous clusters. Such clustering allows more accurate feature normalization and model adaptation for ASR-related tasks. In contrast to similar processing of broadcast data reported in previous work, we can safely assume that a call contains two distinguishable acoustic environments, but new challenges include a much higher rate of speaker change, variation in a talker's speaking style, and the presence of crosstalk and non-meaningful sounds. The algorithm is tested on two-speaker telephone conversations with different genders and over different telephony networks (land-line and cellular). With the purity of the segments and of the final clusters as the performance measure, the BIC-based algorithm approaches the optimal result without requiring an iterative procedure.
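The abstract does not spell out the BIC test itself. As a rough illustration only, the sketch below assumes the commonly used formulation in which each side of a candidate boundary is modeled by a single full-covariance Gaussian, and a positive ΔBIC favors a speaker change. The function names (`delta_bic`, `best_change_point`), the `margin` parameter, and the `penalty_weight` tuning factor are hypothetical and are not taken from the paper.

```python
import numpy as np


def _logdet_cov(frames):
    """Log-determinant of the maximum-likelihood covariance of the frames."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames, split, penalty_weight=1.0):
    """Delta-BIC for splitting `frames` (n x d feature matrix) at index `split`.

    Compares one Gaussian over the whole window against two Gaussians,
    one per side. Values above zero favor a change at `split`.
    """
    n, d = frames.shape
    left, right = frames[:split], frames[split:]
    # Likelihood gain from modeling the two sides separately.
    gain = 0.5 * (
        n * _logdet_cov(frames)
        - len(left) * _logdet_cov(left)
        - len(right) * _logdet_cov(right)
    )
    # Penalty for the extra mean vector and covariance matrix.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def best_change_point(frames, margin=20, penalty_weight=1.0):
    """Return (index, score) of the best split, or (None, score) if no split
    has a positive delta-BIC. `margin` keeps enough frames on each side to
    estimate a covariance matrix."""
    candidates = list(range(margin, len(frames) - margin))
    scores = [delta_bic(frames, i, penalty_weight) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return candidates[best], scores[best]
    return None, scores[best]
```

In practice `frames` would hold per-frame acoustic features such as cepstral coefficients, and the search would be repeated over a sliding or growing window to find successive speaker turns; the feature choice and windowing strategy here are assumptions, since the abstract does not describe them.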

Similar resources

Unsupervised speaker segmentation of telephone conversations

A process for segmenting two-speaker telephone conversations by speaker, with no prior speaker models, is described and evaluated. The process consists of an initial segmentation using acoustic change and pause detection, segment clustering, and iterative modeling of segment clusters and resegmentation. The technique has been evaluated on six customer care conversations, each approximately 3 min long. Th...

Using acoustic condition clustering to improve acoustic change detection on broadcast news

We have developed a system that breaks input speech into segments using an acoustic similarity measure. The aim is to detect the time points where the acoustic characteristics change, usually due to speaker changes but also resulting from changes in the acoustic environment. We have also developed a system to cluster the segments generated by the first system into clusters composed of homogeneo...

Speaker tracking and detection with multiple speakers

We describe a speaker tracking and detection system for Switchboard conversations that uses a two-speaker and silence hidden Markov model (HMM) with a minimum state duration constraint and Gaussian mixture model (GMM) state distributions adapted from a single gender- and handset-independent imposter model distribution. Speaker tracking is used to segment speakers for detection, which is carried...

Telephone Conversation Closing Structure Across English and Persian

Due to the lack of paralinguistic information, politeness gains a considerable significance in telephone conversations (TCs). The use of politeness strategies can help interlocutors promote and/or maintain social harmony in telephone interactions. Using the Rapport Management Model proposed by Spencer-Oatey (2008), this study intended to primarily investigate the fundamental closing structures ...


Publication date: 2003